Improving Word Vector with Prior Knowledge in Semantic Dictionary

نویسندگان

  • Wei Li
  • Yunfang Wu
  • Xueqiang Lv
چکیده

Using low dimensional vector space to represent words has been very effective in many NLP tasks. However, it doesn't work well when faced with the problem of rare and unseen words. In this paper, we propose to leverage the knowledge in semantic dictionary in combination with some morphological information to build an enhanced vector space. We get an improvement of 2.3% over the state-of-the-art Heidel Time system in temporal expression recognition, and obtain a large gain in other name entity recognition (NER) tasks. The semantic dictionary Hownet alone also shows promising results in computing lexical similarity.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Semantic Prosody: Its Knowledge and Appropriate Selection of Equivalents

In translation, choosing appropriate equivalent is essential to convey the right message from source-text to target-text, and one of the issues that may have a determinative role in appropriate equivalent choice is the semantic prosody (SP) behavior of words and the relation existing between the SP of a word and semantic senses (i.e. negativity, positivity or neutrality) of its collocations in ...

متن کامل

Semantic Prosody: Its Knowledge and Appropriate Selection of Equivalents

In translation, choosing appropriate equivalent is essential to convey the right message from source-text to target-text, and one of the issues that may have a determinative role in appropriate equivalent choice is the semantic prosody (SP) behavior of words and the relation existing between the SP of a word and semantic senses (i.e. negativity, positivity or neutrality) of its collocations in ...

متن کامل

Improving Text representations through Probabilistic Integration of Synonymy Relations

The present contribution focuses on the integration of word senses in a vector representation of texts, using a probabilistic model. The vector representation under consideration is the DSIR model, that extends the standard Vector Space (VS) model by taking both occurrences and co-occurrences of words into account. Integration of word senses into the co-occurrence model is done using a Markov R...

متن کامل

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis problem is one of the emerging fields. This problem deals with information extraction and knowledge discovery from textual data using natural language processing has attracted the attention of many researchers. Construction of sentiment lexicon as a valuable language resource is a one of the imp...

متن کامل

Vector Disambiguation for Translation Extraction from Comparable Corpora

We present a new data-driven approach for enhancing the extraction of translation equivalents from comparable corpora which exploits bilingual lexico-semantic knowledge harvested from a parallel corpus. First, the bilingual lexicon obtained from word-aligning the parallel corpus replaces an external seed dictionary, making the approach knowledge-light and portable. Next, instead of using simple...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016